Shrinking L1 Instruction Caches to Improve Energy-Delay in SMT Embedded Processors
نویسندگان
چکیده
Instruction caches are responsible for a high percentage of the chip energy consumption, becoming a critical issue for battery-powered embedded devices. We can potentially reduce the energy consumption of the first level instruction cache (L1-I) by decreasing its size and associativity. However, demanding applications may suffer a dramatic performance degradation, specially in superscalar multi-threaded processors, where, in each cycle, multiple threads access the L1-I to fetch instructions. We introduce iLP-NUCA (Instruction Light Power NUCA), a new instruction cache that substitutes the conventional L2, improving the Energy-Delay of the system. iLP-NUCA adds a new tree-based transport network topology that reduces latency and energy consumption, regarding former LP-NUCA implementations. With iLP-NUCA we reduce the size of the L1-I outperforming conventional cache hierarchies, and reducing the overall consumption, independently of the number of threads.
منابع مشابه
One-Level Cache Memory Design for Scalable SMT Architectures
The cache hierarchy design in existing SMT and superscalar processors is optimized for latency, but not for bandwidth. The size of the L1 data cache did not scale over the past decade. Instead, larger unified L2 and L3 caches were introduced. This cache hierarchy has a high overhead due to the principle of containment, as all the cache blocks in the upper level caches are contained in the lower...
متن کاملReducing Energy in Instruction Caches by Using Multiple Line Buffers with Prediction
Energy efficiency plays a crucial role in the design of embedded processors especially for portable devices with its limited energy source in the form of batteries. Since memory access (either cache or main memory) consumes a significant portion of the energy of a processor, the design of fast low-energy caches has become a very important aspect of modern processor design. In this paper, we pre...
متن کاملMemory Optimizations of Embedded Applications for Energy Efficiency a Dissertation Submitted to the Department of Electrical Engineering and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
The current embedded processors often do not satisfy increasingly demanding computation requirements of embedded applications within acceptable energy efficiency, whereas application-specific integrated circuits require excessive design costs. In the Stanford Elm project, it was identified that instruction and data delivery, not computation, dominate the energy consumption of embedded processor...
متن کاملSoftware instruction caching
As microprocessor complexities and costs skyrocket, designers are looking for ways to simplify their designs to reduce costs, improve energy efficiency, or squeeze more computational elements on each chip. This is particularly true for the embedded domain where cost and energy consumption are paramount. Software instruction caches have the potential to provide the required performance while usi...
متن کاملReducing cache and TLB power by exploiting memory region and privilege level semantics
1383-7621/$ see front matter 2013 Elsevier B.V. A http://dx.doi.org/10.1016/j.sysarc.2013.04.002 q This is an extension of paper ‘‘Reducing L1 Caches Semantics’’, published in International Symposium o Design (ISLPED), July 30 – August 1, 2012. Compared material in this sumission mainly includes: optimiz detailed analysis on the counter-intuitive phenomen by kernel code; detailed analysis on ca...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013